Mobile DNA

Springer Science and Business Media LLC

All preprints, ranked by how well they match Mobile DNA's content profile, based on 27 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Systematic annotation of Helitron-like elements in eukaryote genomes using HELIANO

Li, Z.; Gilbert, C.; Peng, H.; Pollet, N.

2024-02-09 evolutionary biology 10.1101/2024.02.08.579435 medRxiv
Top 0.1%
63.0%

Helitron-like elements (HLEs) are widespread eukaryotic DNA transposons employing a rolling-circle transposition mechanism. Despite their prevalence in fungal, animal, and plant genomes, identifying Helitrons remains challenging. We introduce HELIANO, a software tool for annotating and classifying autonomous and non-autonomous Helitron and Helentron sequences from whole genomes. HELIANO outperforms existing tools in speed and accuracy, as demonstrated through benchmarking and its application to complex genomes (Xenopus tropicalis, Xenopus laevis, Oryza sativa), revealing numerous newly identified Helitrons and Helentrons. In a comprehensive analysis of 404 eukaryote genomes, we found HLEs widely distributed across phyla, with exceptions in specific taxa. Helentrons were identified in numerous land plant species, and 20 protein domains were discovered integrated within specific autonomous HLE families. A global phylogenetic analysis confirmed the classification into main clades Helentron and Helitron, revealing nine subgroups, some enriched in particular taxa. The future use of HELIANO will contribute to the global analysis of TEs across genomes and enhance our understanding of this transposon superfamily.

2
Functional validation of transposable element derived cis-regulatory elements in Atlantic salmon

Sahlstrom, H. M.; Datsomor, A. K.; Monsen, O.; Hvidsten, T. R.; Sandve, S. R.

2022-11-03 evolutionary biology 10.1101/2022.11.02.514921 medRxiv
Top 0.1%
61.1%

Background: Transposable elements (TEs) are hypothesized to play important roles in shaping genome evolution following whole genome duplications (WGD), including rewiring of gene regulation. In a recent analysis, duplicate gene copies that had evolved higher expression in liver following the salmonid WGD ~100 million years ago were associated with higher numbers of predicted TE-derived cis-regulatory elements (TE-CREs). Yet, the ability of these TE-CREs to recruit transcription factors (TFs) in vivo and impact gene expression remains unknown. Results: Here, we evaluated the gene regulatory functions of 11 TEs using luciferase promoter reporter assays in Atlantic salmon (Salmo salar) primary liver cells. Canonical Tc1-Mariner elements from intronic regions showed no or small repressive effects on transcription. However, other TE-derived cis-regulatory elements upstream of transcriptional start sites increased expression significantly. Conclusion: Our results question the hypothesis that TEs in the Tc1-Mariner superfamily, which were extremely active following WGD in salmonids, had a major impact on regulatory rewiring of gene duplicates, but highlight the potential of other TEs in post-WGD rewiring of gene regulation in the Atlantic salmon genome.

3
Hide and seek: de novo identification in sugar beet reveals impact of non-autonomous LTR retrotransposons

Maiwald, S.; Maiwald, F.; Heitkam, T.

2026-03-03 genomics 10.64898/2026.03.01.708851 medRxiv
Top 0.1%
52.3%

Plant genomes are filled with retrotransposons and their derivatives, subject to constant sequence turnover. As short, non-autonomous retrotransposons do not encode a protein product, they experience reduced selective constraints on their DNA sequence, leading to diversification into multiple families, usually limited to only a few species. This absence of any coding capacity and their tendency to form subfamilies are the reasons for the incomplete description of non-autonomous LTR retrotransposons in most, if not all, genomic repeat annotations. Here, we focus on non-autonomous LTR retrotransposon identification. Are all of these sequences derivatives of easier-to-identify full-length elements? Or is there more variability, which is currently overlooked? For this, we capitalize on our comprehensive understanding of the TE landscape in sugar beet to assess the extent of the blind spot on non-autonomous LTR retrotransposons. We present a workflow to identify non-autonomous LTR retrotransposons without prior sequence information, retrieving more than 100 families within the sugar beet genome. We only include TEs without the ability for complete self-mobilization. Spanning up to 15,000 bp, these non-autonomous families are often longer than expected and characterized by reshuffling and modular evolution. Most strikingly, only a few of these families are directly derived from autonomous partners, showing that there is a large, undiscovered TE variety in the non-autonomous TE fraction. We highlight that a large fraction of non-autonomous TEs will not be retrieved with the current TE identification workflows, even if the output is well-curated and condensed into TE libraries, and suggest procedures to remedy this gap. This study is the first insight into the non-autonomous LTR retrotransposon landscape within a single genome and serves as an example to estimate the error in non-autonomous TE detection.

4
Targeted nanopore resequencing and methylation analysis of LINE-1 retrotransposons

Sarkar, A.; Lanciano, S.; Cristofari, G.

2022-06-29 genomics 10.1101/2022.06.25.497594 medRxiv
Top 0.1%
42.8%

Retrotransposition of LINE-1 (L1) elements represents a major source of insertional polymorphisms in mammals, and their mutagenic activity is restricted by silencing mechanisms, such as DNA methylation. Despite a very high level of sequence identity between copies, their internal sequence contains single nucleotide polymorphisms (SNPs) that can alter their activity. Such internal SNPs can also appear in different alleles of a given L1 locus. Given the repetitive nature and relatively long size of L1 elements, short-read sequencing approaches have limited access to L1 internal sequence or DNA methylation state. Here we describe a targeted method to specifically sequence more than a hundred L1-containing loci in parallel and measure their DNA methylation levels using nanopore long-read sequencing. Each targeted locus is sequenced at high coverage (~45X) with unambiguously mapped reads spanning the entire L1 element, as well as its flanking sequences over several kilobases. Our protocol, modified from the nanopore Cas9 targeted sequencing (nCATS) strategy, provides a full and haplotype-resolved L1 sequence and DNA methylation levels. It introduces a streamlined and multiplex approach to synthesize guide RNAs and a quantitative PCR (qPCR)-based quality check during library preparation for cost-effective L1 sequencing. More generally, this method can be applied to any type of transposable element and organism.

5
CUT&Tag Identifies Repetitive Genomic Loci that are Excluded from ChIP Assays

Park, B. J.; Hua, S.; Casler, K. D.; Cefaloni, E.; Ayers, M. C.; Lake, R. F.; Murphy, K. E.; Vertino, P. M.; O'Connell, M. R.; Murphy, P. J.

2025-02-05 genomics 10.1101/2025.02.03.636299 medRxiv
Top 0.1%
41.6%

Determining the genomic localization of chromatin features is an essential aspect of investigating gene expression control, and ChIP-Seq has long been the gold standard technique for interrogating chromatin landscapes. Recently, the development of alternative methods, such as CUT&Tag, has provided researchers with strategies that eliminate the need for chromatin purification and allow for in situ investigation of histone modifications and chromatin-bound factors. Mindful of technical differences, we set out to investigate whether distinct chromatin modifications were equally compatible with these different chromatin interrogation techniques. We found that ChIP-Seq and CUT&Tag performed similarly for modifications known to reside at gene regulatory regions, such as promoters and enhancers, but major differences were observed when we assessed enrichment over heterochromatin-associated loci. Unlike ChIP-Seq, CUT&Tag detects robust levels of H3K9me3 at a substantial number of repetitive elements, with especially high sensitivity over evolutionarily young retrotransposons. IAPEz-int elements, for example, exhibited underrepresentation in mouse ChIP-Seq datasets but strong enrichment using CUT&Tag. Additionally, we identified several euchromatin-associated proteins that co-purify with repetitive loci and are similarly depleted when applying ChIP-based methods. This study reveals that our current knowledge of chromatin states across the heterochromatin portions of the mammalian genome is extensively incomplete, largely due to limitations of ChIP-Seq. We also demonstrate that newer in situ chromatin fragmentation-based techniques, such as CUT&Tag and CUT&RUN, are more suitable for studying chromatin modifications over repetitive elements and retrotransposons. Highlights: In situ fragmentation overcomes biases produced by ChIP-Seq. Heterochromatic regions of the genome are lost to the insoluble pellet during ChIP-Seq. CUT&Tag allows for mapping chromatin features at young repetitive elements. Euchromatin-associated regulatory factors co-purify with insoluble heterochromatin.

6
Annotation of piRNA source loci in the genome of non-model insects

Halbach, R.; van Rij, R. P.

2024-08-15 molecular biology 10.1101/2024.08.15.608080 medRxiv
Top 0.1%
40.2%

The PIWI-interacting RNA (piRNA) pathway plays a crucial role in the defense of metazoan genomes against parasitic transposable elements. The major source of piRNAs in the model organism Drosophila melanogaster is defective transposon copies located in piRNA clusters - genomic regions with a high piRNA density that are thought to serve as an immunological memory of past invasion by those elements. Different approaches have been used to annotate piRNA clusters in model organisms like flies, mice and rats, and software such as proTRAC or piClust is available for piRNA cluster annotation. However, these tools often make assumptions based on current knowledge of piRNA clusters from (mostly vertebrate) model organisms, which do not necessarily hold true for non-model insects in which the piRNA pathway is less understood. Here we describe a simple piRNA cluster annotation approach that makes very few assumptions about the biology of the piRNA pathway. The pipeline has been validated on mosquito genomes but can be easily used for other non-model insect species as well.

7
Introns are derived from transposons

Rogers, S. O.; Bendich, A. J.

2023-02-22 evolutionary biology 10.1101/2023.02.21.529479 medRxiv
Top 0.1%
37.7%

Introns and transposons exhibit many similar features, but the connections between them have yet to be firmly established. Group I introns have commonalities with DNA transposons, while group II introns share many features with retrotransposons. Here, we report the results of an analysis of 214 introns (including group I, group II, group III, twintrons, spliceosomal, and archaeal introns) from members of seven major taxa (within Eukarya, Bacteria, and Archaea) that all have direct repeats at or near both exon/intron borders, indicating that they were inserted via transposition events. Border sequence analysis indicates that after splicing, most mature transcripts would be functionally compromised because they do not restore the DNA sequence information before intron insertion. Transposons and introns thus appear to be members of a diverse assemblage of parasitic mobile genetic elements that secondarily may benefit their host cell and have expanded greatly in eukaryotes from their presumed prokaryotic ancestors. Author Summary: Introns are found in all domains of life. While they are limited in prokaryotes, they have greatly expanded in number and diversity in eukaryotes. We found direct repeat sequences at or near both exon/intron borders for all 214 introns analyzed among eukaryotes, bacteria, and archaea. We infer that all introns were inserted into genes via transposon-like mechanisms and are members of a large family of mobile genetic elements.

8
Accumulation of a biparentally-inherited Neptune transposable element in natural Killifish hybrids (Fundulus diaphanus X F. heteroclitus)

Roussel, A.-J.; Suh, A.; Ruiz-Ruano, F. J.; Dion-Cote, A.-M.

2025-09-10 evolutionary biology 10.1101/2025.05.22.655539 medRxiv
Top 0.1%
33.0%

Transposable elements (TEs) are abundant selfish genetic elements that can mobilize in their host genome, causing DNA damage, mutations and chromosome rearrangements. TE silencing is thus critical, and is initiated by maternally loaded piRNAs, leading to their repression. Consistently, paternally inherited TEs are derepressed in the progeny of Drosophila crosses involving a naive female. TEs have also been found to be derepressed in interspecific crosses, which is proposed to result from suboptimal interactions of piRNA pathway proteins. Fundulus heteroclitus and F. diaphanus hybridize in nature and produce viable and fertile offspring that sometimes reproduce asexually. We characterized the repetitive DNA content of these species and their asexually reproducing hybrids. TE load was slightly higher than expected in hybrids and associated with younger repeats. Two bi-parentally inherited active Neptune subfamilies showed a remarkable ~3-4-fold accumulation in hybrids. These results are consistent with suboptimal piRNA pathway function, leading to active TE accumulation.

9
The epigenetics effects of transposable elements are context dependent and not restricted to gene silencing

Coronado-Zamora, M.; Gonzalez, J.

2023-11-27 evolutionary biology 10.1101/2023.11.27.568862 medRxiv
Top 0.1%
31.5%

Transposable elements (TEs) represent a threat to genome integrity due to their proliferation capacity. Eukaryotic cells silence TEs through different epigenetic mechanisms, including the deposition of repressive histone marks. Previous studies have shown that repressive marks can spread to neighboring sequences. However, evidence for this spreading affecting nearby gene expression remains limited. Similarly, whether TEs induce changes in the enrichment of active histone marks genome-wide, and the potential impact of such changes on gene expression, have not been widely studied. In this work, we performed a comprehensive study of the epigenetic effects of 2,235 TEs and their potential effects on nearby gene expression in D. melanogaster head, gut and ovary. While most of the TEs (816) induce the enrichment of the H3K9me3 repressive mark, with stronger epigenetic effects in the ovary, a substantial number (345 TEs) induce the enrichment of the H3K27ac active mark, particularly in the gut. We found that 70% of the H3K9me3-enriched TEs induced gene down-regulation, and 50% of the H3K27ac-enriched TEs induced gene up-regulation. These changes in expression affect specific regulatory networks in head and gut, while in ovary, genes were not enriched for any biological functions. Furthermore, TE epigenetic effects on gene expression are genomic context dependent. Finally, we found that TEs also affect gene expression by disrupting regions enriched for histone marks. Overall, our results show that TEs do generate regulatory novelty through epigenetic changes, with these epigenetic effects not restricted to gene silencing and being context dependent. Significance statement: Transposable elements (TEs) are repetitive DNA sequences found in nearly all studied organisms that have the capacity to move within the genome. To prevent their proliferation, eukaryotic cells target TEs with repressive histone marks, an epigenetic signal that blocks their expression. While these repressive marks can spread to neighboring genes, the evidence of how this impacts gene expression is limited. Similarly, whether TEs also influence the enrichment and depletion of active histone marks and their genome-wide impact is not understood. In this work, we studied the histone mark enrichment of 2,235 polymorphic TEs across three body parts of D. melanogaster. Our results provide evidence for the genome-wide role of TEs in the generation of regulatory novelty through epigenetic changes.

10
Revisiting the impact of synthetic ORF sequences on engineered LINE-1 retrotransposition

Richardson, S. R.; Chan, D.; Gerdes, P.; Han, J. S.; Boeke, J. D.; Faulkner, G. J.

2022-08-29 molecular biology 10.1101/2022.08.29.505632 medRxiv
Top 0.1%
29.1%

The retrotransposon Long Interspersed Element 1 (L1) contains adenosine-rich ORFs, a characteristic that limits its expression in mammalian cells. A synthetic mouse L1 (smL1) with ORF adenosine content decreased from 40% to 26% showed increased mRNA expression and retrotransposed far more efficiently than the native parental element, L1spa (1). Here, we observe two nonsynonymous substitutions between the L1spa and smL1 ORF1 sequences, and note that the smL1 3' UTR lacks a conserved guanosine-rich region (GRR) which could potentially take on a G-quadruplex secondary structure. We find that the combined effect of a single amino acid change and the GRR 3' UTR deletion, rather than synthetic ORF sequences, accounts for the increase in smL1 retrotransposition efficiency over L1spa. Furthermore, we demonstrate that the position of the GRR within the L1 reporter construct impacts retrotransposition efficiency. Our results prompt a reevaluation of synthetic L1 activity and suggest native mouse L1 mobility has in some cases been underestimated in engineered retrotransposition assays. Author Summary: L1 retrotransposons are mobile DNA elements or "jumping genes" that can copy-and-paste their sequences to new locations in the host genome. The jumping ability, or retrotransposition efficiency, of individual L1 elements can be evaluated using a cultured cell assay in which the L1 is tagged in its 3' untranslated region (3' UTR) with a reporter gene that becomes expressed upon successful retrotransposition. In a previous study, authors Han and Boeke reported that the retrotransposition efficiency of a mouse L1 element could be enhanced dramatically by synthetically increasing the GC content of the L1 ORFs without changing their amino acid sequence. Curiously, a similarly constructed synthetic human L1 achieved only a modest increase in retrotransposition efficiency over the native element. Here, we find that two coding changes and partial deletion of the mouse L1 3' UTR sequence which occurred during construction of the synthetic mouse L1 reporter actually are responsible for the increased jumping of this construct. We also find that changing the placement as well as the presence of this deleted 3' UTR region within the reporter construct determines its impact on engineered retrotransposition efficiency. Together, our study reconciles the disparate impacts of synthetic sequences upon human and mouse L1 retrotransposition efficiency, prompts a reconsideration of numerous studies using synthetic L1 constructs, and will inform the ongoing use of synthetic and natural mouse L1 reporter constructs in vivo and in vitro.

11
Exploring Alu-Driven DNA Transductions in the Primate Genomes

Halabian, R.; Storer, J. M.; Hoyt, S. J.; Hartley, G. A.; Brosius, J.; O'Neill, R. J.; Makalowski, W.

2024-04-30 genetics 10.1101/2024.04.29.591526 medRxiv
Top 0.1%
28.6%

Long terminal repeat (LTR) and non-LTR retrotransposons, aka retroelements, collectively occupy a substantial part of the human genome. Certain non-LTR retroelements, such as L1 and SVA, have the potential for DNA transduction, which involves the concurrent mobilization of flanking non-transposon DNA during retrotransposition. These events can be detected by computational approaches. Although Alu sequences are the most abundant short interspersed sequences (SINEs) that are still active within the genomes of humans and other primates, the transduction rate caused by them remains unexplored. Therefore, we conducted an analysis to address this research gap and utilized an in-house program to probe for the presence of Alu-related transductions in the human genome. We analyzed 118,489 full-length AluY subfamilies annotated within the first complete human reference genome, T2T-CHM13. For comparative insights, we extended our exploration to two non-human primate genomes, the chimpanzee and the rhesus monkey. After manual curation, our findings did not confirm any transductions mediated by Alu elements, whose source genes are, unlike L1 or SVA, transcribed by RNA polymerase III, implying that such events are infrequent or possibly absent not only in the human but also in chimpanzee and rhesus monkey genomes. Although we identified loci in which the 3' Target Site Duplication (TSD) was located distantly from the retrotransposed AluYs, a transduction hallmark, our study could not find further support for such events. The observation of these instances can be explained by the incorporation of other nucleotides into the poly(A) tails in conjunction with polymerase slippage.

12
Atypical landscape of transposable elements in the large genome of Aedes aegypti

Daron, J.; Bergman, A.; Lopez-Maestre, H.; Lambrechts, L.

2024-02-08 evolutionary biology 10.1101/2024.02.07.579293 medRxiv
Top 0.1%
27.0%

Transposable elements (TEs) contribute significantly to variation in genome size among eukaryotic species, but the factors influencing TE accumulation and diversification are only partially understood. Most of our current knowledge about TE organization, dynamics and evolution derives from investigations in model organisms with a relatively small genome size such as Drosophila melanogaster or Arabidopsis thaliana. Whether the observed patterns hold true in larger genomes remains to be determined. The Diptera order is an ideal taxon to address this question, because it includes a forty-year model of TE biology (D. melanogaster) as well as mosquito species with significantly larger genomes. Here, we use a comparative genomics approach to characterize the genomic forces that have shaped the TE content of the Aedes aegypti genome (1.3 Gb) relative to the Anopheles coluzzii genome (300 Mb) and the D. melanogaster genome (180 Mb). Leveraging a newly developed high-quality TE library for Ae. aegypti, our results reveal a contrasting pattern of TE organization in Ae. aegypti compared to An. coluzzii and D. melanogaster. Our analyses suggest that the substantial TE fraction observed in the Ae. aegypti genome reflects both a high rate of TE transposition and a low rate of TE elimination. Together, our results indicate that TE organization and evolutionary dynamics in the large genome of Ae. aegypti are distinct from those of other dipterans with smaller genomes.

13
Dosage compensation of transposable elements in mammals

Wei, C.; Kesner, B.; Weissbein, U.; Wasserzug-Pash, P.; Das, P.; Lee, J. T.

2024-12-18 genetics 10.1101/2024.12.16.628797 medRxiv
Top 0.1%
25.9%

In mammals, X-linked dosage compensation involves two processes: X-chromosome inactivation (XCI) to balance X chromosome dosage between males and females, and hyperactivation of the remaining X chromosome (Xa-hyperactivation) to achieve X-autosome balance in both sexes. Studies of both processes have largely focused on coding genes and have not accounted for transposable elements (TEs), which comprise 50% of the X-chromosome, despite TEs being suspected to have numerous epigenetic functions. This oversight is due in part to the technical challenge of capturing repeat RNAs, bioinformatically aligning them, and determining allelic origin. To overcome these challenges, here we develop a new bioinformatic pipeline tailored to repetitive elements with capability for allelic discrimination. We then apply the pipeline to our recent So-Smart-Seq analysis of single embryos to comprehensively interrogate whether X-linked TEs are subject to either XCI or Xa-hyperactivation. With regard to XCI, we observe significant differences in TE silencing in parentally driven "imprinted" XCI versus zygotically driven "random" XCI. Chromosomal positioning and genetic background impact TE silencing. We also find that SINEs may influence 3D organization during XCI. In contrast, TEs do not undergo Xa-hyperactivation. Thus, while coding genes are subject to both forms of dosage compensation, TEs participate only in Xi silencing. Evolutionary and functional implications are discussed.

14
MCHelper automatically curates transposable element libraries across species

Orozco, S.; Sierra, P.; Durbin, R.; Gonzalez, J.

2023-10-20 genomics 10.1101/2023.10.17.562682 medRxiv
Top 0.1%
22.8%

The number of species with high-quality genome sequences continues to increase, in part due to the scaling up of multiple large-scale biodiversity sequencing projects. While the need to annotate genic sequences in these genomes is widely acknowledged, the parallel need to annotate transposable element sequences that have been shown to alter genome architecture, rewire gene regulatory networks, and contribute to the evolution of host traits is becoming ever more evident. However, accurate genome-wide annotation of transposable element sequences is still technically challenging. Several de novo transposable element identification tools are now available, but manual curation of the libraries produced by these tools is needed to generate high-quality genome annotations. Manual curation is time-consuming, and thus impractical for large-scale genomic studies, and lacks reproducibility. In this work, we present the Manual Curator Helper tool MCHelper, which automates the TE library curation process. By leveraging MCHelper's fully automated mode with the outputs from three de novo transposable element identification tools, RepeatModeler2, EDTA and REPET, in fruit fly, rice, hooded crow, zebrafish, maize, and human, we show a substantial improvement in the quality of the transposable element libraries and genome annotations. MCHelper libraries are less redundant, with up to 65% reduction in the number of consensus sequences, have up to 11.4% fewer false positive sequences, and up to ~48% fewer "unclassified/unknown" transposable element consensus sequences. Genome-wide transposable element annotations were also improved, including larger unfragmented insertions. Moreover, MCHelper is an easy-to-install and easy-to-use tool.

15
ATHILAfinder: a tool to detect ATHILA LTR retrotransposons in plant genomes

Bousios, A.; Primetis, E.

2026-03-22 bioinformatics 10.64898/2026.03.20.713144 medRxiv
Top 0.1%
22.8%

Motivation: The ATHILA lineage of LTR retrotransposons has colonised all branches of the plant tree of life. In Arabidopsis thaliana and A. lyrata, ATHILA elements have invaded centromeres, influencing the genetic and epigenetic organisation, and driving satellite evolution. To assess the broader significance of ATHILA across plants, a computational pipeline is needed to identify ATHILA elements with high efficiency. Existing tools lack this ability because they are optimised for broad transposon classification at the expense of precise annotation of lower taxonomic levels. Results: We present ATHILAfinder, a pipeline for accurate and large-scale discovery of ATHILA elements. ATHILAfinder uses lineage-specific sequence motifs as seeds and additional filters to build de novo intact elements. Homology-based steps rescue intact ATHILA and identify soloLTRs. A detailed identity card includes coordinates, LTR identity, coding capacity, length and other sequence features for every ATHILA. We validate ATHILAfinder in the A. thaliana Col-CEN assembly and five additional Brassicaceae species, covering four supertribes and ~30 million years of evolution. ATHILAfinder has very low false positive rates and outperforms widely-used tools like EDTA and the deep-learning-based Inpactor2 software for both recovery and precision of ATHILA. To demonstrate its usefulness, we generate insights into ATHILA dynamics across Brassicaceae. Outlook: Few computational pipelines target specific transposon lineages, yet such tools can empower their identification and downstream analyses. Our tailored approach can be adapted to other LTR retrotransposon lineages, offering new ways for high-resolution analysis of transposons.

16
A programmable seeker RNA guides target selection by IS1111 and IS110 type insertion sequences.

Siddiquee, R.; Pong, C. H.; Hall, R. M.; Ataide, S. F.

2024-04-27 microbiology 10.1101/2024.04.26.591405 medRxiv
Top 0.1%
22.7%

IS1111 and IS110 insertion sequence (IS) family members encode an unusual DEDD transposase type and exhibit specific target site selection. The IS1111 group includes identifiable subterminal inverted repeats (sTIR) not found in the IS110 type [1]. IS in both families include a noncoding region (NCR) of significant length and, as each individual IS or group of closely related IS selects a different site, we had previously proposed that an NCR-derived RNA was involved in target selection [2]. Here, we found that the NCR is usually downstream of the transposase gene in IS1111 family IS and upstream in the IS110 type. Four IS1111 and one IS110 family members that target different sequences were used to demonstrate that the NCR determines a short seeker RNA (seekRNA) that co-purified with the transposase. The seekRNA was essential for transposition of the IS or a cargo flanked by IS ends from and to the preferred target. Short sequences matching both top and bottom strands of the target were identified in the seekRNA, but their order in IS1111 and IS110 family IS was reversed. Reprogramming the seekRNA and donor flank to target a different site was demonstrated, indicating future biotechnological potential for these systems.

17
Identification and characterization of retro-DNAs, a new type of retrotransposons originated from DNA transposons, in primate genomes

Tang, W.; Liang, P.

2020-03-20 evolutionary biology 10.1101/2020.03.19.999144 medRxiv
Top 0.1%
22.7%

Mobile elements (MEs) can be divided into two major classes based on their transposition mechanisms: retrotransposons and DNA transposons. DNA transposons move in the genomes directly in the form of DNA in a "cut-and-paste" style, while retrotransposons utilize an RNA intermediate to transpose in a "copy-and-paste" fashion. In addition to the target site duplications (TSDs), a hallmark of transposition shared by both classes, the DNA transposons also carry terminal inverted repeats (TIRs). DNA transposons constitute ~3% of primate genomes and they are thought to have been inactive in primate genomes since ~37 My ago despite their success during early primate evolution. Retrotransposons can be further divided into Long Terminal Repeat retrotransposons (LTRs), which are characterized by the presence of LTRs at the two ends, and non-LTRs, which lack LTRs. In the primate genomes, LTRs constitute ~9% of genomes and have a low level of ongoing activity, while non-LTR retrotransposons represent the major types of MEs, contributing to ~37% of the genomes with some members being very young and currently active in retrotransposition. The four known types of non-LTR retrotransposons include LINEs, SINEs, SVAs, and processed pseudogenes, all characterized by the presence of a polyA tail and TSDs, which mostly range from 8 to 15 bp in length. All non-LTR retrotransposons are known to utilize the L1-based target-primed reverse transcription (TPRT) machineries for retrotransposition. In this study, we report a new type of non-LTR retrotransposon, which we named retro-DNAs, to represent elements that are DNA transposons by sequence but non-LTR retrotransposons by transposition mechanism in the recent primate genomes. By using a bioinformatics comparative genomics approach, we identified a total of 1,750 retro-DNAs, which represent 748 unique insertion events in the human genome and nine non-human primate genomes from the ape and monkey groups. These retro-DNAs, mostly fragments of full-length DNA transposons, carry no TIRs but longer TSDs, with ~23.5% also carrying a polyA tail and with their insertion site motifs and TSD length pattern characteristic of non-LTR retrotransposons. These features suggest that these retro-DNAs are DNA transposon sequences likely mobilized by the TPRT mechanism. Further, at least 40% of these retro-DNAs are located in genic regions, presenting significant potential for impacting gene function. More interestingly, some retro-DNAs, as well as their parent sites, show certain levels of current transcriptional expression, suggesting that they have the potential to create more retro-DNAs in the current primate genomes. The identification of retro-DNAs, despite their small number, reveals a new mechanism for propagating DNA transposon sequences in the primate genomes in the absence of canonical DNA transposon activity. It also suggests that the L1 TPRT machinery may have the ability to retrotranspose a wider variety of DNA sequences than is currently known.

18
p53 mediated regulation of LINE1 retrotransposon derived R-loops

Paul, P.; Kumar, A.; De, A. K.; Parida, A. S.; Bhadke, G. V.; Khatua, S.; Pattanayak, F.; Tiwari, B.

2024-08-20 cancer biology 10.1101/2024.04.12.589154 medRxiv
Top 0.1%
22.6%
Show abstract

Long Interspersed Nuclear Element 1 (LINE1/L1) retrotransposons, comprising around 17% of the human genome, typically remain quiescent in healthy somatic cells but become activated in various cancer types. Our recent investigation reveals that p53 silences L1 transposons in human somatic cells, potentially constituting a tumor suppressive pathway. In this study, we demonstrate that p53 silences both L1mRNA-gDNA (cis L1 R-loops) and L1mRNA-cDNA hybrids (trans L1 R-loops) formed during retrotransposition. The activation of L1 transposons by HDAC inhibitors (HDACi) led to accumulation of these cis and trans L1 R-loops in p53-/- cells, which was mitigated by treatment with a reverse transcriptase inhibitor. Furthermore, p53 established re-silencing of hyperactivated L1 transposons induced by HDACi. The p53-mediated restoration of silencing was accompanied by recruitment of repressive histone marks, specifically H3K9me3 and H3K27me3, and inhibition of the deposition of H3K4me3 and H3K9ac marks at the L1 promoter. This study elucidates a novel role of p53 in regulating the formation of RNA-DNA hybrids, a pivotal intermediate component of retrotransposition, and initiating the suppression of hyperactivated L1 elements. These findings underscore the significance of p53 in preserving genome stability through the regulation of L1-derived R-loops.

In Brief: The role of L1 transposon-derived L1mRNA-cDNA hybrids, an intermediate product formed during retrotransposition, in DNA damage and inflammation is not clear. Paul et al. reveal that p53 prevents L1cDNA-derived RNA-DNA hybrids to control DNA damage and activation of inflammatory genes. The findings also elucidate the role of p53 in initiating the repression of hyperactivated transposons by facilitating the recruitment of epigenetic repressive marks and preventing the deposition of activating marks at the L1-5'UTR.

Highlights:
p53 loss facilitates accumulation of both cis (L1mRNA-gDNA) and trans (L1mRNA-cDNA) forms of L1 R-loops.
The youngest, actively retrotransposing full-length L1s contribute to the formation of trans (L1mRNA-cDNA) R-loops.
p53 aids immediate L1 re-silencing by restoring deposition of repressive epigenetic marks and inhibiting activating marks.
Reverse transcriptase inhibitor prevents L1-mediated DNA damage.

Subject Categories: L1/LINE1, p53, Retrotransposons, RNA-DNA hybrids, Cis R-loops, Trans R-loops

Graphical abstract: Figure 1

19
Manual versus automatic annotation of transposable elements: case studies in Drosophila melanogaster and Aedes albopictus, balancing accuracy and biological relevance

Carrasco-Valenzuela, T.; Marino, A.; Storer, J. M.; Bonnici, I.; Mazzoni, C. J.; Fontaine, M. C.; Haudry, A.; Boulesteix, M.; Fiston-Lavier, A.-S.

2025-01-12 genomics 10.1101/2025.01.10.632341 medRxiv
Top 0.1%
22.3%
Show abstract

Transposable elements (TEs) play a pivotal role in genome evolution, yet their detection and annotation remain challenging due to the limitations of current methods. Manual curation is considered the gold standard for generating TE libraries, particularly for TE-focused studies, although it requires extensive training and time. With the rapid increase in genome assembly publications and the growing need for large-scale comparative analyses, automated software for TE annotation has become indispensable. This study compares manual and automated approaches to TE detection and annotation, focusing on two species: Drosophila melanogaster and Aedes albopictus. In D. melanogaster, a species with a well-annotated TE repertoire and a smaller genome, the differences between manual curation (MCTE) and automated annotation (ATTE) are relatively minor. However, significant differences arise when analysing Ae. albopictus, a species with a larger genome and higher TE diversity. While automated methods identified a greater number of TEs, including many smaller and fragmented elements, manual curation provided more detailed classifications and, on average, larger consensus sequences. Automated pipelines offer a viable alternative for genome-wide analyses such as TE content estimation, particularly when time and resources are limited. However, caution is advised when interpreting results, as finer details of TE dynamics may be overlooked. This study highlights that the choice of annotation method depends on the intended analysis. Manual curation is more suitable for TE population genomics and studies focusing on recent transposable element activity, while automated methods are appropriate for larger comparative analyses or genome assembly projects. Ultimately, both methods have their strengths and limitations, and understanding the specific features of the genome and repeatome under study is essential for selecting the appropriate approach.

20
Recent horizontal transfer of transposable elements in Drosophila

Pritam, S.; Signor, S.

2025-10-31 evolutionary biology 10.1101/2025.10.30.685650 medRxiv
Top 0.1%
22.2%
Show abstract

Transposable elements (TEs) are genetic elements, also known as "jumping genes", that increase their copy number within a host through various mechanisms of transposition. TEs can also move between species through unknown intermediaries in a process known as horizontal transfer (HT), infecting novel genomes and increasing in copy number. While many individual invasions have been documented, a large dataset of recent horizontal transfer events that would allow us to understand larger, more general HT patterns has not been assembled. In this manuscript we used almost 400 dipteran genomes to uncover 637 recent transposon invasions, mostly in Drosophila. The majority of transfers occurred between closely related species, with the cosmopolitan melanogaster group showing the highest recent transfer activity. We even documented a single transposon with 16 recent transfers, many between different Drosophila groups. Using species distance on a phylogenetic tree to measure the distance that transposons travel, we found that DNA transposons transfer between distantly related species much more frequently than retrotransposons. This potentially represents a different evolutionary strategy for exploiting naive genomes. Our phylogenetic framework advances the understanding of horizontal transfer dynamics at the species level within Drosophila.